Hypotheses

Previous research (Cohen, 2015) has argued that Paradigmatic Enhancement is the result of competition between representations of contextually viable alternatives. According to Cohen, paradigmatically related alternatives are stored as phonetically detailed exemplar representations. When the context does not fully determine which alternative is produced, exemplars of all alternatives are activated, influencing the pronunciation of the produced form. In other words, the phonetic enhancement of paradigmatically supported forms reflects a lack of reduction due to interference from the pronunciation of the alternative forms. Given this account, we would only expect to see paradigmatic enhancement if inflected representations actually play an important role during production. For Dutch plural inflections, the relative frequency of the plural and singular forms has been argued to reflect the degree to which Dutch plurals are composed (by rule or analogy) or represented. Our own model of variable plural distribution seems to support this claim:

load("varPluralData.RData")
library(knitr)
library(aods3)

distribution_model = aodml(cbind(f_s, f_nons) ~ p_s*log_freq_pl + p_s*prop_pl, var)

plot_aodml_effect(var, distribution_model, predictor_var = "p_s", moderator_var = "prop_pl", 
                  constant_vars = c("log_freq_pl"), dependent_var = "prop_s", 
                  predictor_lab = "Probability(-s)", moderator_lab = "Proportion(PL)", 
                  dependent_lab = "Proportion(-s)", moderator_values = "0-0.5-1")

The plot above shows that, when the singular is more frequent than the plural (at low Proportion(PL)), the proportion of the -s variant (Proportion(-s)) can be predicted from the phonological features of the singular (represented by Probability(-s)). However, when the plural is more frequent than the singular (at high Proportion(PL)), phonological generalization works much less well as a predictor. Presumably, this is due to strong representations of the plural forms that resist the phonological pressures.

Given these results, our hypothesis is that a higher Proportion(-s) should lead to a longer duration of -s only when Proportion(PL) is high.

Initial Analysis

Below we model log(Duration(-s)) as a function of significant covariates and the interaction between Proportion(-s) and Proportion(PL).

library(knitr)
library(lmerTest)
duration_model = lmer(log_s_dur ~ speech_rate_pron_sc +
                        PC1_sc + PC2_sc + PC3_sc +
                        next_phon_class +
                        register +
                        prop_s*prop_pl +
                        (1 | speaker) + (1 | word),
                      data = s_dur)

# Trim observations with absolute standardized residuals above 2.5 and refit.
s_dur$dur_resid = resid(duration_model)
s_dur_trim = s_dur[abs(scale(s_dur$dur_resid)) < 2.5,]

duration_model_trim = lmer(log_s_dur ~ speech_rate_pron_sc +
                             PC1_sc + PC2_sc + PC3_sc +
                             next_phon_class +
                             register +
                             prop_s*prop_pl +
                             (1 | speaker) + (1 | word),
                           data = s_dur_trim)
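
The fit-trim-refit step above can be wrapped in a small helper. This is only a hypothetical convenience sketch (the name trim_refit is ours, not part of the analysis); it assumes the model responds to resid() and update(), as lmer models do:

```r
# Hypothetical helper: refit a model after removing observations whose
# standardized residuals exceed a cutoff (2.5 SD by default).
trim_refit = function(model, data, cutoff = 2.5) {
  keep = abs(scale(resid(model))) < cutoff
  update(model, data = data[keep, ])
}

# duration_model_trim = trim_refit(duration_model, s_dur)
```
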

library(sjPlot)
plot_model(duration_model_trim, type = "eff", terms = c("prop_s", "prop_pl[0, 0.5, 1]"), colors = "bw", legend.title = "Proportion(PL)", title = "", axis.title = c("Proportion(-s)", "log(duration(-s))"))

The plot above shows that we do find paradigmatic enhancement, but only when Proportion(PL) is high. This is in line with our hypothesis. However, the plot also seems to show a reduction effect when Proportion(PL) is low.

Double-checking the interaction

Let’s find out whether either the apparent paradigmatic enhancement or the reduction is due to collinear predictors or non-linear relations between the variables.

First, let’s see if either the reduction or the enhancement effect is only apparent by checking whether quadratic predictors improve the model:

duration_model_quad = lmer(log_s_dur ~ speech_rate_pron_sc +
                             PC1_sc + PC2_sc + PC3_sc +
                             next_phon_class +
                             register +
                             poly(prop_s,2)*poly(prop_pl,2) +
                             (1 | speaker) + (1 | word),
                           data = s_dur)
kable(as.matrix(summary(duration_model_quad)$coefficients), caption = "Coefficients")
Coefficients
Estimate Std. Error df t value Pr(>|t|)
(Intercept) -2.6698144 0.0347143 193.70893 -76.9081670 0.0000000
speech_rate_pron_sc -0.1063218 0.0169115 549.39564 -6.2869589 0.0000000
PC1_sc 0.0371866 0.0152224 572.24011 2.4428905 0.0148715
PC2_sc 0.0351433 0.0148204 570.07292 2.3712744 0.0180582
PC3_sc 0.0322952 0.0150026 577.43301 2.1526316 0.0317613
next_phon_classAPP -0.1672756 0.0803680 575.32980 -2.0813697 0.0378417
next_phon_classF 0.0229725 0.0463662 572.72214 0.4954568 0.6204678
next_phon_classL -0.0910898 0.1373295 570.64801 -0.6632937 0.5074103
next_phon_classN -0.0633337 0.0780012 571.67854 -0.8119579 0.4171538
next_phon_classP -0.0562494 0.0643604 575.02132 -0.8739758 0.3824963
next_phon_classSIL 0.5136385 0.0387770 575.24111 13.2459479 0.0000000
registerstories 0.1307360 0.0374479 299.25033 3.4911438 0.0005534
registernews -0.0236922 0.0659327 77.05438 -0.3593391 0.7203243
poly(prop_s, 2)1 -0.5065052 0.4736805 53.89692 -1.0692973 0.2897014
poly(prop_s, 2)2 0.2306023 0.4831137 84.63228 0.4773252 0.6343610
poly(prop_pl, 2)1 -0.1750697 0.5185962 34.89452 -0.3375838 0.7376992
poly(prop_pl, 2)2 -0.0644631 0.5155372 38.97858 -0.1250407 0.9011340
poly(prop_s, 2)1:poly(prop_pl, 2)1 29.5196937 12.3277534 41.48201 2.3945721 0.0212382
poly(prop_s, 2)2:poly(prop_pl, 2)1 13.3373930 13.6540548 104.49544 0.9768082 0.3309201
poly(prop_s, 2)1:poly(prop_pl, 2)2 -0.5241158 12.7705669 47.35373 -0.0410409 0.9674358
poly(prop_s, 2)2:poly(prop_pl, 2)2 7.1908778 13.3990392 71.17055 0.5366711 0.5931685

As you can see, none of the quadratic terms improves the model. Only poly(prop_s, 2)1:poly(prop_pl, 2)1, which represents the interaction in which both predictors are linear, is significant.
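
The same conclusion can be checked with a likelihood-ratio test between the linear and quadratic models (both were fit to s_dur; anova() refits them with ML before comparing):

```r
# Likelihood-ratio test: do the quadratic terms improve the fit?
anova(duration_model, duration_model_quad)
```
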

Now, let’s check the associations between all variables in the model:

library(rcompanion)
library(corrplot)

# Pairwise associations between all variables in the model:
#   factor-factor   -> bias-corrected Cramer's V
#   factor-numeric  -> sqrt(R^2) of a one-way lm
#   numeric-numeric -> Pearson's r
vars = c("next_phon_class", "register", "speech_rate_pron_sc",
         "PC1_sc", "PC2_sc", "PC3_sc", "prop_s", "prop_pl", "log_s_dur")
var_labs = c("Next Phonetic Class", "Register", "Speech Rate", "Prosody 1",
             "Prosody 2", "Prosody 3", "Proportion(-s)", "Proportion(PL)",
             "log(duration(-s))")
factor_vars = c("next_phon_class", "register")

pred_ass = matrix(NA, nrow = length(vars), ncol = length(vars),
                  dimnames = list(var_labs, var_labs))
for (i in seq_along(vars)) {
  for (j in seq_along(vars)) {
    x = vars[i]
    y = vars[j]
    if (x %in% factor_vars & y %in% factor_vars) {
      pred_ass[i, j] = cramerV(table(s_dur[, c(x, y)]), bias.correct = TRUE)
    } else if (x %in% factor_vars) {
      pred_ass[i, j] = sqrt(summary(lm(as.formula(paste(y, "~", x)),
                                       data = s_dur))$r.squared)
    } else if (y %in% factor_vars) {
      pred_ass[i, j] = sqrt(summary(lm(as.formula(paste(x, "~", y)),
                                       data = s_dur))$r.squared)
    } else {
      pred_ass[i, j] = cor(s_dur[[x]], s_dur[[y]])
    }
  }
}
corrplot(pred_ass, method = "number")

As you can see, the predictors of interest and the covariates are not strongly correlated with one another. However, some of the covariates are rather strongly associated with our dependent variable. To get a better idea of which data points support the interaction between Proportion(-s) and Proportion(PL), let's first residualize on the covariates and then inspect our interaction effect.

duration_model_cov = lmer(log_s_dur ~ speech_rate_pron_sc +
                            PC1_sc + PC2_sc + PC3_sc +
                            next_phon_class +
                            register +
                            (1 | speaker) + (1 | word),
                          data = s_dur)
s_dur$resid_dur = resid(duration_model_cov) 

s_dur$prop_pl_groups = factor(cut(s_dur$prop_pl, breaks = 3), labels = c("small", "average", "large"))

library(ggplot2)
ggplot(s_dur, aes(x = prop_s, y = resid_dur, color = prop_pl_groups)) +
  geom_point(size = .9, alpha = .3) +
  geom_smooth(method = "lm", se = F) +
  theme_bw() +
  labs(x = "Proportion(-s)", y = "residual(duration(-s))", color = "Proportion(PL)") +
  ylim(-0.5, 0.5)
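
A numeric counterpart to the smoothed lines above is the ordinary least-squares slope of Proportion(-s) within each Proportion(PL) group. This is a rough check only, since it ignores the random-effects structure:

```r
# Slope of prop_s on the covariate-residualized durations, per group.
sapply(split(s_dur, s_dur$prop_pl_groups),
       function(d) coef(lm(resid_dur ~ prop_s, data = d))["prop_s"])
```
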

From the plot above it becomes obvious that the residuals still contain quite a lot of variance that is not explained by our interaction of interest. As a result, it is hard to tell visually whether the reduction effect, the enhancement effect, or both are supported by the data. Using the interactions package, we can explore this more formally by computing the Johnson-Neyman interval.

library(interactions)

jn = johnson_neyman(duration_model_trim, pred = prop_s, modx = prop_pl, plot = T)
jn$bounds
##     Lower    Higher 
## 0.3115674 0.8684140
jn$plot + xlab("Proportion(PL)") + ylab("Slope of Proportion(-s)")

This tells us that the effect of Proportion(-s) is significant if Proportion(PL) is either below 0.31 or above 0.87. In other words, the significant interaction reflects both a reduction and an enhancement effect. So what is the explanation for the reduction effect?
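
For transparency, the Johnson-Neyman bounds can also be recovered directly from the fixed effects and their covariance matrix. A minimal sketch, assuming the coefficient names prop_s and prop_s:prop_pl as fitted above:

```r
# The simple slope of prop_s at a given prop_pl is b1 + b3 * prop_pl, with
# Var = Var(b1) + prop_pl^2 * Var(b3) + 2 * prop_pl * Cov(b1, b3).
# The effect is (approximately) significant wherever |slope / SE| > 1.96.
b  = fixef(duration_model_trim)
V  = as.matrix(vcov(duration_model_trim))
pl = seq(0, 1, by = 0.001)
slope = b["prop_s"] + b["prop_s:prop_pl"] * pl
se = sqrt(V["prop_s", "prop_s"] +
            pl^2 * V["prop_s:prop_pl", "prop_s:prop_pl"] +
            2 * pl * V["prop_s", "prop_s:prop_pl"])
range(pl[abs(slope / se) < 1.96])  # prop_pl values where the slope is *not* significant
```
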

Secondary Analysis

First of all, we should remember from our distributional study that what Proportion(-s) represents depends on the value of Proportion(PL). At high Proportion(PL), it might be a measure of paradigmatic competition, but at low Proportion(PL), it might represent the amount of phonological support from similar paradigms. Could phonological support result in phonetic reduction? Previous research on phonological neighbourhood density suggests that this might be the case (Gahl, Yao, & Johnson, 2012). Is there some way to investigate whether the durational reduction in our data is due to phonological support? We cannot include Probability(-s) and Proportion(-s) in the same model, as the two measures are strongly correlated. However, we can include Probability(-s) together with the residuals of Proportion(-s) from the distributional model. If we assume that the reduction effect is due to increased phonological support, this allows us to make a number of predictions.

Let’s see if these predictions are borne out:

# Predicted Proportion(-s) from the distributional model (probability scale)
# and the deviation of the predicted from the observed proportion.
s_dur$pred_prop_s = plogis(predict(distribution_model, newdata = s_dur))
s_dur$resid_prop_s = s_dur$pred_prop_s - s_dur$prop_s

duration_model2 = lmer(log_s_dur ~ speech_rate_pron_sc +
                         PC1_sc + PC2_sc + PC3_sc +
                         next_phon_class +
                         register +
                         p_s +
                         resid_prop_s*prop_pl +
                         (1 | speaker) + (1 | word),
                       data = s_dur)

s_dur$dur_resid = resid(duration_model2)
s_dur_trim = s_dur[abs(scale(s_dur$dur_resid)) < 2.5,]

duration_model2_trim = lmer(log_s_dur ~ speech_rate_pron_sc +
                              PC1_sc + PC2_sc + PC3_sc +
                              next_phon_class +
                              register +
                              p_s +
                              resid_prop_s*prop_pl +
                              (1 | speaker) + (1 | word),
                           data = s_dur_trim)

kable(as.matrix(summary(duration_model2_trim)$coefficients), caption = "Coefficients")
Coefficients
Estimate Std. Error df t value Pr(>|t|)
(Intercept) -2.5279653 0.0593808 179.79914 -42.5721253 0.0000000
speech_rate_pron_sc -0.1138740 0.0151382 533.21773 -7.5222865 0.0000000
PC1_sc 0.0423187 0.0137060 558.62030 3.0875930 0.0021180
PC2_sc 0.0372968 0.0132933 568.32714 2.8056772 0.0051931
PC3_sc 0.0394431 0.0134518 568.46434 2.9321790 0.0035014
next_phon_classAPP -0.1548083 0.0713879 568.48830 -2.1685528 0.0305306
next_phon_classF 0.0443955 0.0411668 565.50418 1.0784302 0.2813016
next_phon_classL -0.0959535 0.1229804 562.91140 -0.7802340 0.4355811
next_phon_classN -0.0597803 0.0697033 564.95643 -0.8576396 0.3914552
next_phon_classP -0.0605670 0.0582319 569.07741 -1.0401002 0.2987350
next_phon_classSIL 0.5514603 0.0349728 568.36772 15.7682459 0.0000000
registerstories 0.1209972 0.0329155 292.18502 3.6759980 0.0002819
registernews -0.0411585 0.0555990 66.91342 -0.7402744 0.4617231
p_s -0.1096361 0.0537727 100.11135 -2.0388804 0.0440976
resid_prop_s 0.3907298 0.1195568 63.12994 3.2681528 0.0017537
prop_pl -0.1790010 0.0785717 53.42592 -2.2781866 0.0267407
resid_prop_s:prop_pl -0.9052802 0.2461469 47.47221 -3.6778047 0.0005989

The negative coefficient for p_s shows that increased Probability(-s) does indeed reduce the duration of -s. Now, let's explore the interaction between Resid(Proportion(-s)) and Proportion(PL):

plot_model(duration_model2_trim, type = "eff", terms = c("resid_prop_s", "prop_pl[0, 0.5, 1]"), colors = "bw", legend.title = "Proportion(PL)", title = "", axis.title = c("Resid(Proportion(-s))", "log(duration(-s))"))

The cross-over interaction we see here is consistent with an account in which the residuals of the distributional model represent different aspects of variable plural production, depending on the value of Proportion(PL). At low Proportion(PL), the residuals probably represent the errors in the phonological predictions (represented by the Probability(-s) variable). At high Proportion(PL), the residuals represent the unexplained variance in Proportion(-s) due to the variation being stored.
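
As a quick numeric check on the cross-over, the simple slope of Resid(Proportion(-s)) implied by the coefficient table flips sign between the endpoints of the Proportion(PL) range (estimates copied from the table above):

```r
b1 = 0.3907298   # coefficient of resid_prop_s
b3 = -0.9052802  # coefficient of resid_prop_s:prop_pl
b1 + b3 * 0      # slope at Proportion(PL) = 0: positive
b1 + b3 * 1      # slope at Proportion(PL) = 1: negative
```
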

References